Abstract

The ordinal regression method was used to model the
relationship between the ordinal outcome variable, i.e.,
different levels of student satisfaction regarding the overall
college experience, and the explanatory variables
concerning demographics and student learning environment
in a predominantly minority health sciences center. The
outcome variable for student satisfaction was measured
on an ordered, categorical, four-point Likert scale:
‘very dissatisfied’, ‘dissatisfied’, ‘satisfied’, and ‘very
satisfied’. Explanatory variables included two
demographics, i.e., gender and ethnic group, and 42
questionnaire items related to satisfaction with faculty
involvement, curriculum contents, support services,
facilities, and leisure activities at the college. The major
decisions involved in the model building for ordinal
regression were deciding which explanatory variables
should be included in the model and choosing the link
function (e.g., logit link or complementary log-log link)
that demonstrated the model appropriateness. In addition,
the model fitting statistics, the accuracy of the classification
results, and the validity of the model assumption, i.e.,
parallel lines, were assessed for selecting the
best model. The research findings indicated that explanatory
variables such as faculty competence and student-faculty
relations were significantly associated with the satisfaction
of the overall college experience. This finding suggests
that faculty members have played a major role in creating
a pleasant environment to facilitate student satisfaction.
In addition, the curriculum content regarding health
promotion and disease prevention was significantly
associated with the satisfaction of the overall college
experience. It may also provide strong evidence that a
specific component of the medical curriculum addressed
student needs and contributed to the fulfillment of the
medical college goal, e.g., delivery of primary care through
health promotion and disease prevention.

Introduction

There has been an increasing emphasis on the study
of student satisfaction in colleges and universities in 
America based on the notion that students have needs 
and rights to participate in quality programs and to receive 
satisfactory services. The satisfaction surveys provide 
colleges and universities with real pictures of the key 
issues perceived by their students. Consequently, the 
satisfaction results from the questionnaire surveys have 
been used as feedback information to help college 
administrators and faculty enhance the quality of programs 
and services. 

Different statistical methods used to analyze satisfaction 
data yield results with different focuses. These methods 
include descriptive statistics, chi-square, linear regression 
analysis, multilevel modeling, and ordinal regression 
techniques. Descriptive statistics, e.g., means, frequencies,
and proportions of student responses, are often applied to
detect the most and the least satisfactory items regarding
college programs and services (Cooney, 2000; Damminger,
2001; Wild, 2000). The chi-square method is used to
identify significant differences in the proportions of student
satisfaction responses between student retention groups
(Bailey, Bauman, and Lata, 1998).

Regression methods such as linear, logistic, and ordinal 
regression are useful tools to analyze the relationship 
between multiple explanatory variables and student 
satisfaction results (Thomas and Galamos, 2002;
Hummel and Lichtenberg, 2001). The regression methods
are capable of allowing researchers to identify explanatory 
variables related to academic programs and services that 
contribute to the overall college satisfaction. These methods 
also permit researchers to estimate the magnitude of the 
effect of the explanatory variables on the outcome variable. 
Therefore, regression methods seem to be superior in 
studying the relationship between the explanatory and 
outcome variables. Despite the prevalence of linear and
logistic regression analyses, researchers find it
challenging to use ordinal regression analysis to
study ordinal outcomes, in part because they have not
been fully exposed to the mathematical theory and the
application software. Nowadays, the availability of statistical
software routines in the Statistical Package for the Social
Sciences (SPSS) or the Statistical Analysis System
(SAS) makes it computationally feasible to build an
ordinal regression model.

The application of linear, logistic, and ordinal regression 
methods depends largely on the measurement scale of 
the outcome variables and the validity of the model 
assumptions. The outcome variable may be measured on a
continuous scale (e.g., total satisfaction scores), as a binary
measure (e.g., satisfaction and dissatisfaction ratings), or as an
ordered category (e.g., very dissatisfied, dissatisfied, satisfied,
and very satisfied). Linear regression analysis is applicable
to the outcome variable measured on a continuous scale 
while logistic regression analysis works well only for the 
binary or dichotomous outcome. In linear and logistic 
regression analyses, the model assumptions of normality 
and constant variance for the residual and the outcome 
data points need to be satisfied to demonstrate their 
appropriateness. If researchers wish to study the effects 
of explanatory variables on all levels of the ordered 
categorical outcome, an ordinal regression method must 
be chosen to obtain valid results. More
examples of ordinal outcomes include certain psychological
measurements (e.g., levels of anxiety or depression), rank
scores (e.g., letter grades for course work), and the
most frequently used Likert scale (e.g., “poor”, “fair”, “good”,
and “excellent” ratings). It is implausible to assume
normality and homogeneity of variance for an ordered
categorical outcome when the ordinal outcome contains
merely a small number of discrete categories. Thus, the
ordinal regression model becomes a preferable modeling
tool that does not assume normality and constant
variance, but requires the assumption of parallel lines
across all levels of the categorical outcome.

The step-by-step procedures for building, evaluating, 
and interpreting the ordinal regression model were illustrated 
in this study. Essentially, the study followed four 
sequential protocols to create a workable model. First of 
all, the potential explanatory variables were examined to 
determine if they should be included in the model. Second, 
the outcome variable was coded or labeled as ordered, 
ranked, and categorical values. The explanatory variables 
were either a continuous or a discrete measure. Third, the 
complete and the reduced models along with the logit link 
and the complementary log-log (cloglog) link were used to
generate the candidate models. The complete model
contained all the explanatory variables while the reduced
model included a subset of the predetermined explanatory
variables. The logit and the cloglog links were chosen to
build models based on the distribution of the ordinal outcome,
either evenly distributed among all categories or clustered
around higher categories. Finally, the best model was
chosen among all candidate models based on the model 
fitting statistics, the accuracy of the classification results, 
the validity of the model assumption, and the principle of 
parsimony. Clearly, the ordinal regression is a unique 
modeling technique in that the outcome variable is measured 
on the ordered categorical scale, various link functions are 
readily available to apply, and the validity of the model 
assumption for parallel lines is essentially assessed. 

In this study, the ordinal regression model was 
constructed to explore and examine the relationship 
between the satisfaction of overall college experience and 
the explanatory variables concerning demographics and 
the satisfaction ratings of student learning environment. 
The study results could lead to a better understanding of 
the satisfaction of college programs and services from 
student perspectives. The research question might be 
formulated as “How well can the satisfaction of the overall 
college experience be accounted for by the explanatory 
variables concerning college-learning environment?” The 
outcome variable of interest was the satisfaction of overall 
college experience, with a four-level ordinal measure such 
as “very satisfied”, “satisfied”, “dissatisfied”, and “very 
dissatisfied”. Explanatory variables included two 
demographics, e.g., gender and ethnic groups, and 42 
satisfaction questionnaire items related to faculty 
involvement, curriculum contents, support services, 
facilities, and leisure activities in college. Using the ordinal
regression method, researchers could identify the significant
explanatory variables within their control to enhance student
satisfaction regarding the college-learning environment. The
ultimate goal of the study was to make recommendations 
to enhance faculty involvement, curriculum contents, and 
support services as appropriate in the light of the research 
findings. 

Literature Review 

For decades, researchers in higher education have
assessed student satisfaction for three different reasons.
First, most researchers have measured solely the levels
of student satisfaction in order to identify the most and the
least satisfactory college programs and services for
accountability reporting and self-improvement purposes.
Second, some researchers have examined student
satisfaction to see if satisfaction ratings of college programs
and services are associated with the satisfaction of the overall
college experience. Lastly, few researchers have
investigated student satisfaction items related to the
occurrence of educational events such as student
retention and attrition.

To obtain various satisfaction results, different statistical
methods such as descriptive statistics, chi-square, linear
regression, multilevel modeling, and ordinal regression
techniques have been commonly found in the literature to
analyze student satisfaction questionnaires. Descriptive
statistics were extensively used to detect the most and
the least satisfactory items that students had experienced
with their college programs and services. For instance,
the mean responses to a student satisfaction survey
conducted by the Noel-Levitz Company revealed the levels of
community college student satisfaction. The survey respondents rated
highest satisfaction on responsiveness to diverse 
populations, registration effectiveness, and academic 
services, while rating the lowest satisfaction on admissions 
and financial aid, academic advising, and campus support 
services (Cooney, 2000). Using percentages, means, 
modes, and qualitative written reports, student satisfaction 
with the quality of integrated academic and career advising 
was summarized. The study results indicated that most 
students were “extremely satisfied” or “very satisfied” with 
their combined academic and career counseling service 
(Damminger, 2001). An additional example of making use
of descriptive statistics was to compare student satisfaction
via frequency distribution between two campuses within a
university. The study (Wild, 2000) showed that 14 percent
of student respondents chose ‘very satisfied’ ratings in the
areas of access to information and student orientation, 
respectively. Students on both campuses highly rated 
staff helpfulness, financial aid staff, campus safety and 
access to computers, while expressing dissatisfaction 
with off-hours access to registration and the bookstore. 

Chi-square, linear regression, and multilevel modeling 
techniques were generally utilized to investigate the 
association between the explanatory variables and the 
outcome variable such as student retention and overall 
satisfaction with academic programs and services. Cross- 
tabulation and chi-square techniques were used (Bailey, 
Bauman, and Lata, 1998) to predict college student 
retention based on satisfaction. A strong relationship 
between student satisfaction and retention was found on 
40 of the 68 questions (59%). Using linear regression and 
decision tree analysis with the chi-squared automatic 
interaction detector (CHAID) software program, a study 
(Thomas and Galamos, 2002) compared student 
satisfaction responses between academically- and non- 
academically-oriented student groups. The research 
results demonstrated that faculty preparedness, social 
integration, and pre-enrollment opinions emerged as the 
most important variables contributing to student satisfaction 
for both groups. Linear regression methods were used to 
investigate the relationship between student satisfaction 
and medical school learning environment (Robins et al.,
1997). The study results provided evidence that curriculum
structures (e.g., timely feedback and promotion of critical
thinking) were prominent explanatory variables. Using a
multilevel modeling technique to analyze survey data, one
study (Umbach and Porter, 2001) examined the impact
that different departments have on student satisfaction in
a large research university. The research findings revealed
that characteristics of departments such as size, faculty 
contact with students, research emphasis, and proportion 
of female students had a significant impact on education 
satisfaction within major. 

Using an ordinal regression model, a recent
study (Hummel and Lichtenberg, 2001)
estimated the probabilities of the four ordinal
categories (“worse”, “can’t tell”, “better”, and “much better”)
of client improvement in a counseling center. The research
findings showed that five explanatory variables were
significantly associated with the probability of an outcome
category. These variables included previous experience
as a client; readiness to change; level of symptomatic and 
interpersonal distress; pre-counseling clinical status; and 
the number of counseling sessions in which a client might 
be involved. 

Based on the literature review, one might conclude that
descriptive statistics (e.g., means, percentages, and
frequency counts), chi-square (e.g., cross-tabulation,
Pearson’s chi-square test, decision tree with the CHAID
software program), linear regression, and multilevel
modeling approaches were increasingly utilized to study
student satisfaction in relation to various explanatory
variables. However, compared to these methods,
the ordinal regression method seems to be the most
suitable and practical technique to analyze the effects of
multiple explanatory variables on an ordinal outcome that
cannot be assumed to be continuous or normally
distributed. Researchers do not need to collapse an ordinal
outcome into a binary or dichotomous measure for logistic
regression analysis, which may lead to the loss of inherent
information. Although ordinal regression analysis is
currently underused in the field of education, several articles
were found in the medical field that illustrated the
foundation of the mathematical model and made use of
ordinal regression.

In ordinal regression analysis, the two major link 
functions, e.g., logit and cloglog links, are used to build 
specific models. There is no clear-cut method to distinguish 
the preference of using different link functions. However, 
the logit link is generally suitable for analyzing the ordered 
categorical data evenly distributed among all categories. 
The cloglog link may be used to analyze the ordered 
categorical data when higher categories are more probable 
(SPSS, Inc., 2002). 

The ordinal regression model may be written in the following
form if the logit link is applied:

$$f[\gamma_j(X)] = \log\left\{\frac{\gamma_j(X)}{1 - \gamma_j(X)}\right\} = \log\left\{\frac{P(Y \le y_j \mid X)}{P(Y > y_j \mid X)}\right\} = \alpha_j + \beta X, \quad j = 1, 2, \ldots, k - 1,$$

and

$$\gamma_j(X) = \frac{e^{(\alpha_j + \beta X)}}{1 + e^{(\alpha_j + \beta X)}},$$

where j indexes the cut-off points for all categories (k) of
the outcome variable. If multiple explanatory variables are
applied to the ordinal regression model, $\beta X$ is replaced by
the linear combination $\beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_p X_p$ (Bender and
Benner, 2000). The function $f[\gamma_j(X)]$ is called the link
function that connects the systematic components (i.e., $\alpha_j + \beta X$)
of the linear model (Gill, 2001). The alpha $\alpha_j$
represents a separate intercept or threshold for each
cumulative probability. The threshold ($\alpha_j$) and the regression
coefficient ($\beta$) are unknown parameters to be estimated
by means of the maximum likelihood method.
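
As an illustration of the logit-link equations above, the following minimal sketch evaluates the cumulative probabilities $\gamma_j(X)$ and then differences them to obtain the probability of each category. The thresholds, coefficients, and covariate values are hypothetical numbers chosen only for the example, and the function names are not taken from any statistical package.

```python
import math

def cumulative_probs_logit(thresholds, betas, x):
    """gamma_j(X) = e^(a_j + bX) / (1 + e^(a_j + bX)) for j = 1..k-1."""
    eta = sum(b * xi for b, xi in zip(betas, x))  # linear combination b1*X1 + ... + bp*Xp
    return [1.0 / (1.0 + math.exp(-(a + eta))) for a in thresholds]

def category_probs(cum):
    """Difference the cumulative probabilities to get P(Y = y_j | X), j = 1..k."""
    bounds = [0.0] + list(cum) + [1.0]
    return [hi - lo for lo, hi in zip(bounds, bounds[1:])]

# Hypothetical parameters for a four-category outcome (k = 4, so 3 thresholds).
thresholds = [-2.0, -0.5, 1.5]   # alpha_1 < alpha_2 < alpha_3
betas = [0.8, -0.3]              # one coefficient per explanatory variable
x = [1.0, 2.0]                   # one respondent's explanatory values

cum = cumulative_probs_logit(thresholds, betas, x)
probs = category_probs(cum)
print([round(p, 3) for p in probs])
```

Because the same coefficients are used at every cut-off point, only the thresholds separate the three cumulative curves; this is the parallel lines structure that the model assumes.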

The name of the logit link can be traced back to the
logistic regression function where the odds of event
occurrence are defined as the ratio of the probability of event
occurrence to the probability of event non-occurrence,
i.e., $\gamma(X) / [1 - \gamma(X)] = e^{\alpha + \beta X}$. The log (odds), i.e.,
$\log\{\gamma(X) / [1 - \gamma(X)]\}$, is called the logit, which equals the linear
form $\alpha + \beta X$ (Hosmer and Lemeshow, 1989). Notice that
the ordinal regression model is called the cumulative logit
model because the model is built based on the cumulative
response probabilities $\gamma_j(X)$ of being in category (j) or
lower given the known explanatory variable (Walters et al.,
2001). The ordinal regression model with the logit link is
also known as the proportional odds model because the
regression coefficient (e.g., log odds) is independent of
the category (Bender and Benner, 2000). A part of Table
1 below shows that the cumulative response probabilities
were calculated for ordinal regression equations in the
logit link.

In constructing the ordinal regression model, an 
alternative choice to the logit link is the cloglog link 
function. The ordinal regression model may be written in 
the following form if the cloglog link is used to create the 
model. 

$$f[\gamma_j(X)] = \log\{-\log[1 - \gamma_j(X)]\} = \log\left\{-\log\left[\frac{P(Y = y_j \mid X)}{P(Y > y_j \mid X)}\right]\right\} = \alpha_j + \beta X,$$

and

$$\gamma_j(X) = 1 - e^{-e^{(\alpha_j + \beta X)}},$$

where j = 1, 2, ..., k - 1 and j indexes the cut-off points
for all categories of the
outcome variable. Again, if multiple explanatory variables
are involved in the ordinal regression model, the linear
combination $\beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_p X_p$ is substituted for $\beta X$
(Bender and Benner, 2000). The term "complementary"
comes from $[1 - \gamma_j(X)]$. Thus, the name of the
complementary log-log link function is derived from
$\log\{-\log[1 - \gamma_j(X)]\}$, which equals the linear form $\alpha_j + \beta X$.
The ordinal regression model with the cloglog link is
called the continuation ratio model because it is a ratio of
the two conditional probabilities, i.e., $P(Y = y_j \mid X)$ to
$P(Y > y_j \mid X)$. The model with the cloglog link is also called the
proportional hazard model because the relationship
between the explanatory variables and the ordinal outcome
is independent of the category (Bender and Benner, 2000).
The other part of Table 1 shows that the response
probabilities were calculated for the ordinal regression
equations in the cloglog link.
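
The cloglog link and its inverse can be checked numerically. The sketch below is only an illustration under assumed values: the thresholds and the single coefficient are hypothetical, and the helper names are invented for the example. It confirms that applying the link to $\gamma_j(X) = 1 - e^{-e^{(\alpha_j + \beta X)}}$ returns the linear form, and it shows the asymmetry of the link at a linear predictor of zero.

```python
import math

def cloglog(p):
    """Complementary log-log link: log{-log[1 - p]}."""
    return math.log(-math.log(1.0 - p))

def inverse_cloglog(eta):
    """Inverse link: gamma = 1 - exp(-exp(eta))."""
    return 1.0 - math.exp(-math.exp(eta))

# Hypothetical thresholds and a single coefficient for a four-category outcome.
thresholds = [-2.0, -0.5, 1.0]
beta, x = 0.6, 1.0

gammas = [inverse_cloglog(a + beta * x) for a in thresholds]
recovered = [cloglog(g) for g in gammas]   # should reproduce a + beta * x

# Unlike the logit link, the cloglog link is not symmetric around 0.5:
# a linear predictor of 0 maps to 1 - 1/e rather than to 0.5.
at_zero = inverse_cloglog(0.0)
print([round(g, 4) for g in gammas], round(at_zero, 4))
```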

Table 1

Comparison of the Probability Calculations between the Logit Link and Cloglog Link Based on a 4-Category* Ordinal Outcome

| Cut-off Point or Equation (j) | Cumulative Logit Model or Proportional Odds Model or Logit Link: $\log\{P(Y \le y_j \mid X) / P(Y > y_j \mid X)\}$ | Continuation Ratio Model or Proportional Hazard Model or Cloglog Link: $\log\{-\log[P(Y = y_j \mid X) / P(Y > y_j \mid X)]\}$ |
| --- | --- | --- |
| 1 | Category 1 vs. Categories 2, 3, 4 | Category 1 vs. Categories 2, 3, 4 |
| 2 | Categories 1, 2 vs. Categories 3, 4 | Category 2 vs. Categories 3, 4 |
| 3 | Categories 1, 2, 3 vs. Category 4 | Categories 1, 2, 3 vs. Category 4 |

* Examples of Satisfaction Measure: Category 1 = Very Dissatisfied; Category 2 = Dissatisfied; Category 3 = Satisfied; Category 4 = Very Satisfied

The essential features of the ordinal regression model
regardless of any link function may be briefly described.
First, the outcome variable of interest is a grouped and
ordered category that may be regrouped from an unobserved
continuous latent variable (Scott et al., 1997). However,
it is not clear whether the ordinal outcome is equally
spaced. Second, the ordinal regression analysis employs
a link function to describe the effect of the explanatory
variables on the ordered categorical outcome in such a way
that the assumptions of normality and constant variance
are not required (McCullagh and Nelder, 1989). Third, the
model assumes that the relationship between the
explanatory variables and the ordinal outcome is
independent of the category because the regression
coefficient does not depend on the categories of the
outcome variable. In other words, the model assumes that
the corresponding regression coefficients in the link function
are equal for each cut-off point (Bender and Benner, 2000).
Hence, the model assumption of ‘parallel
lines’ has to be verified carefully by the test of parallel lines
(SPSS, Inc., 2002).

It is interesting to note that the ordinal regression
model with the logit link has the property of invariance. If
the outcome variable (Y) is coded in the reversed order,
the signs of the regression coefficients are simply
reversed (Greenland, 1994; Walters et al., 2001).
Based on the characteristics of invariance in the logit link,
the study results of the ordinal regression analysis would
not be affected by the direction of the coding scheme.
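
The invariance property can be illustrated numerically. In the sketch below, which uses hypothetical parameter values, the cumulative logit model is evaluated once with the original coding and once with the reversed coding, where the reversed model uses the negated coefficient and the negated thresholds in reverse order. Under these assumptions the two sets of category probabilities are mirror images of each other.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def category_probs(thresholds, beta, x):
    """Category probabilities from the cumulative logit model a_j + beta * x."""
    cum = [logistic(a + beta * x) for a in thresholds]
    bounds = [0.0] + cum + [1.0]
    return [hi - lo for lo, hi in zip(bounds, bounds[1:])]

# Hypothetical parameters for a four-category outcome.
thresholds, beta, x = [-1.5, 0.0, 2.0], 0.7, 1.2
original = category_probs(thresholds, beta, x)

# Reversed coding (category 1 becomes 4, 2 becomes 3, and so on):
# the coefficient changes sign; the thresholds are negated and reversed.
rev_thresholds = [-a for a in reversed(thresholds)]
reversed_probs = category_probs(rev_thresholds, -beta, x)

# Each reversed category has the probability of its original counterpart.
print([round(p, 4) for p in original])
print([round(p, 4) for p in reversed_probs])
```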

Methodology 

Being quality conscious and student oriented,
administrators and faculty are generally concerned with
the quality of programs and services that they offer to
students. A graduating student questionnaire has been
conducted annually as part of an ongoing evaluation process
to solicit student perceptions concerning programs and
services. Hence, survey data were collected for all
graduating medical students during years 1999-2001. The
satisfaction items were part of the entire questionnaire
that allowed graduating students to report their own
satisfaction regarding the college-learning environment. The
questionnaire responses were summarized into the relative
frequency distribution and submitted to the school dean
for decision-making purposes. With a different and in-depth
focus, the ordinal regression analysis was performed
to gain insight into how individual items were associated
with the overall college satisfaction. The PC-based version
11.0 of SPSS was used to perform the
ordinal regression analysis. The graduating student
questionnaire was considered to be an observational and
cross-sectional rather than an experimental study. This
study did not involve the random assignment of
treatment or control to students, nor did it involve the
manipulation of any treatment or variable to observe
group differences. The questionnaire items consisted of
the student satisfaction for the overall college experience
(i.e., the outcome variable), two demographics, namely
gender and ethnic group, and the 42 satisfaction items
(i.e., the explanatory variables).

The 42 explanatory variables were interrelated and 
classified into the five pre-determined factors — faculty 
involvement, curriculum contents, support services, 
facilities, and leisure activities in college. Factor I - faculty 
involvement included items such as accessibility to faculty, 
faculty competence, faculty attitude toward students, 
quality of instruction, student-faculty relations, and 
instruction/course evaluation. Factor II - curriculum contents 
incorporated psychological factors in health/illness, cultural 
factors in disease development, medical ethics, health 
promotion/disease prevention, HIV/AIDS, clinical skills, 
communication skills, interpersonal skills, computer skills, 
and research skills. Factor III - support services referred to 
admission and registration, financial aid services, library 
services, tutorial program, board review program, personal 
counseling, and career counseling. Factor IV - facilities 
covered classroom facilities, laboratory facilities, housing 
facilities, and parking. Factor V - leisure activities in 
college was composed of student recreation, cultural 
events, and social events. 

Gender was coded 1 for males and 0 for females while 
ethnicity was coded 1 for African American and 0 for non- 
African American. The 42 questionnaire items used a five- 
point Likert scale: 0 for being “not applicable”, 1 for being 
“very dissatisfied”, 2 for being “dissatisfied”, 3 for being 
“satisfied”, and 4 for being “very satisfied”. The generally high
internal consistency of the survey instrument is
indicated by the alpha reliability: all items
combined 0.89 (42 items); faculty involvement 0.87 (10
items); curriculum contents 0.81 (10 items); support
services 0.82 (17 items); facilities 0.42 (3 items); and
leisure activities 0.75 (2 items).
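
For readers unfamiliar with the alpha reliability, the following sketch computes Cronbach's alpha from item-level scores using the usual formula, k/(k-1) times one minus the ratio of the summed item variances to the variance of the total score. The ratings are hypothetical and are not the study data; the sketch also assumes complete responses, with any "not applicable" answers already removed.

```python
from statistics import variance

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent, with no missing values."""
    k = len(rows[0])                                   # number of items
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])       # variance of total scores
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)

# Hypothetical ratings (1 = very dissatisfied ... 4 = very satisfied)
# from five respondents on a three-item factor.
ratings = [
    [3, 4, 3],
    [2, 2, 3],
    [4, 4, 4],
    [1, 2, 1],
    [3, 3, 4],
]
alpha = cronbach_alpha(ratings)
print(round(alpha, 2))
```

Alpha depends on how strongly the items move together, so a factor whose items are only weakly related can show a low value.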

The primary focus of the study was the formulation of 
the ordinal regression model, the application of ordinal 
regression analysis, and the interpretation of study results. 
The student satisfaction questionnaire was analyzed by
the ordinal regression method to achieve the four study
objectives: (a) to identify significant explanatory variables,
i.e., satisfaction items, in the five factors that
influenced the overall college satisfaction; (b) to estimate
thresholds (i.e., constants) and regression coefficients;
(c) to describe the direction of the relationship between
the explanatory variables and the overall college satisfaction
based on the sign (+ and -) of regression coefficients; and
(d) to perform classifications for all satisfaction levels of
the overall college experience, and subsequently evaluate
the accuracy of the classification results.

The major decisions involved in constructing the ordinal 
regression models were deciding what explanatory variables 
to include in the model equation and choosing link functions 
that would be the best fit to the data set. Two commonly 
used link functions, e.g., logit link and cloglog link, were 
chosen to build the ordinal regression models. If the 
frequency distribution of the ordered categorical outcome 
exhibited that the data points were evenly distributed in 
various categories, then the logit link function might be 
appropriate. If the frequency distribution of the ordered 
categorical outcome showed that a large percent of student 
respondents were in higher categories such as very 
satisfied and satisfied ratings, then the cloglog link function 
might be suitable. In fact, there was no clear-cut choice 
of link functions. If one link function did not provide a good 
fit to the data, then the other link function might be a viable 
alternative. As a result, it was worth trying the alternative 
link function to see if the model turned out to be the better 
one. In addition, the model assumption of parallel lines 
across the corresponding response categories in the link 
functions was carefully examined to determine the model 
adequacy. Because the link functions were used to form 
the ordinal regression models under a strong assumption 
of parallel lines, any departures from this assumption 
might result in the incorrect analysis and conclusion 
(McCullagh, 1980). Furthermore, the contingency table 
showing the accuracy of the classification for the ordered 
categorical outcome was evaluated to determine which 
link function was superior. 
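
The classification check described above can be sketched as follows. The sketch assumes that each respondent is assigned to the category with the largest estimated response probability, and that accuracy is the share of respondents on the diagonal of the observed-by-predicted contingency table. The probabilities and observed categories are hypothetical, and the function names are invented for the example.

```python
def predict_category(probs):
    """Return the 1-based category with the largest estimated probability."""
    return max(range(len(probs)), key=lambda j: probs[j]) + 1

def cross_tab(observed, predicted, k):
    """k-by-k contingency table: rows are observed, columns are predicted."""
    table = [[0] * k for _ in range(k)]
    for o, p in zip(observed, predicted):
        table[o - 1][p - 1] += 1
    return table

def accuracy(table):
    """Proportion of cases on the diagonal (correctly classified)."""
    correct = sum(table[i][i] for i in range(len(table)))
    return correct / sum(sum(row) for row in table)

# Hypothetical estimated probabilities for six respondents, four categories.
estimated = [
    [0.05, 0.10, 0.35, 0.50],
    [0.05, 0.15, 0.60, 0.20],
    [0.02, 0.08, 0.40, 0.50],
    [0.10, 0.30, 0.50, 0.10],
    [0.01, 0.04, 0.25, 0.70],
    [0.55, 0.30, 0.10, 0.05],
]
observed = [4, 3, 3, 2, 4, 1]

predicted = [predict_category(p) for p in estimated]
table = cross_tab(observed, predicted, k=4)
print(predicted, round(accuracy(table), 3))
```

Running the same comparison for the logit and cloglog models gives one accuracy figure per link function, which can then be compared directly.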

In order to interpret the ordinal regression model, 
researchers would first look at the signs of the regression 
coefficients. These signs give a great deal of insight into 
the effects of the explanatory variables on the ordinal 
outcome. The positive regression coefficient indicated that 
there was a positive relationship between the explanatory 
variable and the ordinal outcome. For the opposite direction, 
the negative regression coefficient indicated that there 
was a negative relationship between the explanatory variable 
and the ordinal outcome. If the logit link (or cloglog link) was
chosen for the modeling equation, the magnitude (e.g., odds
or $e^{\beta}$) of the effect of a specific explanatory variable would
be used to indicate that a one-unit change in
that explanatory variable changes the
odds (or relative risk) of the event occurrence by a factor of
$e^{\beta}$, holding the other explanatory variables constant.
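
A short numerical check makes the odds interpretation concrete for the logit link. The threshold and coefficient below are hypothetical, and the odds are the cumulative odds defined by the logit-link equation given earlier in this paper. Raising the explanatory variable by one unit multiplies those odds by $e^{\beta}$ regardless of the starting value.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    return p / (1.0 - p)

# Hypothetical threshold and coefficient for one cut-off point.
alpha, beta = -0.4, 0.5
x = 2.0

odds_before = odds(logistic(alpha + beta * x))
odds_after = odds(logistic(alpha + beta * (x + 1.0)))   # one-unit increase in x
ratio = odds_after / odds_before                        # matches exp(beta)
print(round(ratio, 4), round(math.exp(beta), 4))
```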

Researchers need to be aware of the potential limitations
in the study. Although the graduating student questionnaire
data were gathered over a three-year period for a
small medical college, the sample size was still too small
to yield high statistical power given that
many explanatory variables entered the equation for
analysis. Additionally, the item responses coded as zero
for being “not applicable” were treated as missing values
and excluded from the study. The large percentage of cells
with missing data could lead to an inaccurate chi-square
test for the model fitting. Note that the model goodness-of-fit
is usually dependent on chi-square test results.
However, if the number of cells with zero value is large, the
chi-square goodness-of-fit statistics may not be
appropriate (Agresti, 1990). Therefore, researchers are
limited in how well they can assess the overall explanatory
power of the models. Finally, the ordinal regression
analysis with the logit and cloglog links did not support
selecting a subset of significant explanatory variables by
automatic model building methods such as stepwise and
backward elimination procedures in the SPSS command language.
Therefore, the selection of explanatory variables in the
model depended on the researchers' intuition and the
trial-and-error approach described in the following two
paragraphs.

The model construction generally involves the use of
the complete and the reduced models along with various
link functions to create a pool of candidate models. By
examining one candidate model at a time, researchers
should use the test of parallel lines as the fundamental
step to assess the validity of the model assumption.
Candidate models in the pool need to be discarded
if they fail to provide evidence of satisfying the
model assumption. Additionally, the model fitting statistics,
e.g., pseudo R squares, and the accuracy of classification
results should be used as criteria to screen the candidate
models and choose the appropriate ones. Once these
appropriate models are chosen, researchers can
temporarily eliminate a few observations or insignificant
explanatory variables (say, one or two) from the questionnaire
data to investigate whether the modified models maintain their
stability (e.g., model parameters change only slightly after
the temporary elimination). If the modified models exhibit
instability, they need to be discarded.
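
As an illustration of the model fitting statistics mentioned above, the sketch below computes three commonly reported pseudo R-square measures (Cox and Snell, Nagelkerke, and McFadden) from the log-likelihoods of the intercept-only model and the fitted model. That these three are the measures used in the study is an assumption, and the log-likelihood values and sample size are hypothetical.

```python
import math

def pseudo_r_squares(ll_null, ll_model, n):
    """Pseudo R-squares from the intercept-only and fitted log-likelihoods."""
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)   # upper bound of Cox and Snell
    return {
        "cox_snell": cox_snell,
        "nagelkerke": cox_snell / max_cox_snell,        # rescaled to reach 1
        "mcfadden": 1.0 - ll_model / ll_null,
    }

# Hypothetical log-likelihoods for a sample of 100 respondents.
stats = pseudo_r_squares(ll_null=-200.0, ll_model=-150.0, n=100)
print({name: round(value, 3) for name, value in stats.items()})
```

These measures are best used to compare candidate models fitted to the same data rather than read as a proportion of explained variance.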

Finally, the principle of parsimony should apply to 
model construction. Webster’s dictionary defines 
parsimony as stinginess, meaning that if fewer explanatory 
variables are sufficient to explain the outcome variable, 
the regression model does not 
need to include unnecessary variables. Based on the 
principle of parsimony, the reduced models that met the 
above screening criteria should be considered the ideal 
models. However, without automatic model building 
methods in the SPSS package, the selection of a “few” 
“important” explanatory variables to form the reduced 
models remains a challenging task for researchers. For 
instance, how did researchers decide which explanatory 
variables were important? The questionnaire items for which 
a large percentage of student respondents expressed 
satisfaction (e.g., the most satisfactory: 90% or more) 
or dissatisfaction (e.g., the least satisfactory: 30% or 
more) might reasonably be considered “important” 
explanatory variables. Another question was, 
‘How many important explanatory variables were needed 
in the reduced models?’ This is a case of not knowing how 
many underlying variables there are for the given data. 
Because a minimum ratio (e.g., 1 to 10) of the number of 
explanatory variables to the sample size is 
recommended by a logistic regression study (Peng et al., 
2002), the number of explanatory variables could be 
determined by dividing the number of completed 
questionnaires by 10. 
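
As a rough illustration of the 1-to-10 rule described above, the following Python sketch computes the largest number of explanatory variables that a given number of completed questionnaires would support. The function name max_explanatory_variables, its cases_per_variable parameter, and the sample sizes are illustrative assumptions and are not part of the original study.

```python
def max_explanatory_variables(n_complete, cases_per_variable=10):
    """Largest number of explanatory variables allowed by a 1-to-N rule of thumb.

    n_complete: number of completed questionnaires (usable cases).
    cases_per_variable: cases required per explanatory variable (10 for a 1-to-10 ratio).
    """
    if n_complete < 0 or cases_per_variable <= 0:
        raise ValueError("n_complete must be >= 0 and cases_per_variable > 0")
    # Integer division rounds down, so the ratio is never exceeded.
    return n_complete // cases_per_variable


# Hypothetical sample sizes, for illustration only.
for n in (100, 200, 300):
    print(n, "completed questionnaires ->", max_explanatory_variables(n), "variables")
```

Rounding down is a conservative design choice; a researcher might instead treat the result as a guideline rather than a hard limit.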



SPSS PC Commands for Ordinal 
Regression Analysis 

Seven steps in the SPSS PC version 11.0 commands 
were required to produce the ordinal regression model: 
Step 1 - Click Analyze, click Regression, and click Ordinal; 
Step 2 - Click over exp (dependent variable), and click the 
<right arrow> sign to move it to the dependent box; Step 
3 - Hold down the CTRL key, click all independent variables, 
and click the <right arrow> sign to move them to the covariates 
box; Step 4 - Click the <down arrow> sign to display the 
ordinal regression options and select Logit Link or 
Complementary Log-Log Link, then click Continue; Step 5 
- Click the Output button and select the display options: goodness of fit 
statistics, summary statistics, parameter estimates, and test 
of parallel lines; Step 6 - Under Save variables, select estimated 
response probability and predicted category, and click 
Continue; and Step 7 - Click OK. 

Study Results 

From 1999 to 2001, a total of 179 graduating medical 
students completed and returned the questionnaires, for 
a response rate of 83% (179/216). The relative frequency 
distribution of all student satisfaction ratings was prepared. 
The student respondents were satisfied (50%) or very 
satisfied (45%) with the overall college experience. The 
majority of student respondents seemed to be satisfied 
with the college programs and services, regardless of 
gender and ethnic group. The student respondents were 
most satisfied (i.e., top 10 item ratings in terms of the 
total percent of student respondents who reported ‘satisfied’ 
or ‘very satisfied’) with accessibility to faculty, faculty 
competence, quality of instruction, student-faculty 
relations, health promotion/disease prevention, HIV/AIDS, 
medical ethics, clinical skills, communication skills, and 
the bookstore. In contrast, the student respondents 
were least satisfied (i.e., bottom 10 item ratings in terms 
of the total percent of student respondents who reported ‘satisfied’ 
or ‘very satisfied’) with career counseling, personal 
counseling, student recreation, computer-assisted 
instruction, computer skills, research methodology, video 
services, mail room services, housing facilities, and parking. 

The complete model analyzed 148 of the 179 
questionnaires and excluded 31 questionnaires from the 
study because they had at least one item with missing 
data or a ‘not applicable’ rating. The study results for the 
complete model containing all satisfaction items revealed 
a number of interesting findings. Among the complete 
models, the cloglog link was the better choice because it 
satisfied the ‘parallel lines’ assumption and had larger 
model-fitting statistics, which will be discussed later. 

Table 2 

Explanatory Variables Associated with the 
Overall College Satisfaction Based on the 
Complete Model with the Complementary Log-log Link 

                                  Regression                                                     Regression
Item Name                         Coefficient P Value   Item Name                                Coefficient P Value
Threshold (Category=2)            11.895      .000*     Board Review Program                     .134        .647
Threshold (Category=3)            17.579      .000*     Student Health Services                  -.062       -.062
Gender                            -.393       .380      Academic Counseling                      .481        .067
Ethnicity                         .915        .069      Personal Counseling                      .021        .911
Accessibility to the Dean         .620        .031*     Career Counseling                        .171        .405
Accessibility to Faculty          .760        .040*     Parking                                  .139        .606
Faculty Competence                .874        .054      Campus Security                          -.400       .152
Faculty Attitude Toward Students  .897        .047*     Bookstore                                .208        .605
Bookstore                         .208        .605      Classroom Facilities                     .029        .959
Student-Faculty Relations         1.176       .023*     Laboratory Facilities                    .596        .254
Grading System                    -.278       .578      Housing Facilities                       -.046       .760
Teaching Evaluation               .031        .961      Student Recreation                       0.77        .740
Course Evaluation                 .461        .446      Cultural/Social Events                   -.011       .970
Scheduling of Classes             .522        .241      Psychosocial Factors in Health/Illness   .627        .055
Student-Administrator Relations   -1.83       .598      Cultural Factors in Disease Development  .714        .133
Admission & Registration          .354        .302      Medical Ethics                           0.83        .843
Financial Aid Services            -.059       .794      Health Promotion/Disease Prevention      1.013       .060
Mail Room Services                -.206       .288      HIV/AIDS                                 1.385       .021*
Library Services                  .323        .194      Clinical Skills                          .997        .162
Video Services                    .134        .463      Communication Skills                     -.578       .155
Computer-Assisted Instructions    .145        .478      Interpersonal Skills                     .270        .307
Computer Services                 -.135       .602      Computer Skills                          -.304       .197
Tutorial Services                 .362        .061      Research Methodology                     .102        .583

* Significant difference from zero (p<.05) 

Using the complete model with the cloglog link, Table 
2 shows that the two thresholds of the model equation 
were significantly different from zero and substantially 
contributed to the values of the response probability in 
different categories. In addition, the satisfaction of the 
overall college experience was significantly associated 
with five explanatory variables (i.e., accessibility to 
the dean, accessibility to faculty, faculty attitude toward 
students, student-faculty relations, and HIV/AIDS). These 
five significant explanatory variables exhibited positive 
regression coefficients, indicating that students who rated 
higher levels of satisfaction on these explanatory variables 
were likely to rate a higher satisfaction for the overall 
college experience. Of these five satisfaction items, three 
(60 percent) were related to faculty involvement: 
accessibility to faculty, faculty attitude toward 
students, and student-faculty relations. Furthermore, none of 
the satisfaction items regarding facilities, support services, 
and leisure activities in college was significantly associated 
with the satisfaction of the overall college experience. 
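
To make the role of the two thresholds concrete, the following Python sketch converts thresholds and a linear predictor into response probabilities for each category. It assumes the cumulative model form link(P(Y <= j)) = threshold_j - eta, where eta is the sum of the regression coefficients multiplied by the item ratings; this form, the function names, and the eta values are assumptions made for illustration. Only the two threshold values are taken from Table 2.

```python
import math


def inverse_link(z, link="cloglog"):
    """Map a value on the link scale back to a cumulative probability."""
    if link == "logit":
        return 1.0 / (1.0 + math.exp(-z))
    if link == "cloglog":
        return 1.0 - math.exp(-math.exp(z))
    raise ValueError(f"unknown link: {link}")


def category_probabilities(thresholds, eta, link="cloglog"):
    """Return P(Y = category) for each ordered category.

    thresholds: increasing cut-off values, one fewer than the number of categories.
    eta: linear predictor (sum of coefficient * rating), assumed form theta_j - eta.
    """
    cumulative = [inverse_link(t - eta, link) for t in thresholds] + [1.0]
    probabilities = []
    previous = 0.0
    for value in cumulative:
        probabilities.append(value - previous)  # difference of adjacent cumulative probabilities
        previous = value
    return probabilities


THRESHOLDS = (11.895, 17.579)  # thresholds reported in Table 2

# Hypothetical linear predictor values, for illustration only.
for eta in (10.0, 14.0, 18.0):
    probs = category_probabilities(THRESHOLDS, eta)
    print(eta, [round(p, 3) for p in probs])
```

Under this assumed form, a larger linear predictor shifts probability toward the highest category, which matches the interpretation of positive regression coefficients given above.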

Using the complete model with the logit link to build the 
ordinal regression model, the satisfaction of the overall 
college experience was found to be significantly associated 
with six explanatory variables: ethnicity, accessibility 
to the dean, accessibility to faculty, student-faculty 
relations, health promotion/disease prevention, and 
HIV/AIDS. However, because the complete model with the 
logit link failed to provide evidence of satisfying the ‘parallel 
lines’ assumption (i.e., convergence could not be attained 
according to the SPSS printout), the research findings 
mentioned above should be discarded. Therefore, it is 
unnecessary to present a table that contains the item names, 
regression coefficients, and p values in this paper. 

The model-fitting statistic, namely the pseudo R square, 
measured the success of the model in explaining the 
variations in the data. The pseudo R square was calculated 
from the likelihood ratio. For example, 
McFadden’s R square compared the likelihood for the 
intercept-only model to the likelihood for the model with 
the explanatory variables in order to assess the model 
goodness of fit. The interpretation of the pseudo R square in 
the ordinal regression model was similar to that of the R 
square (i.e., the coefficient of determination) in the linear 
regression model: the pseudo R square indicated the 
proportion of variation in the outcome variable that was 
accounted for by the explanatory variables. The larger the 
pseudo R square, the better the model fit. 
The pseudo R squares for McFadden (.56), Cox and Snell 
(.60), and Nagelkerke (.75) in the complete model with the 
cloglog link were larger than those for McFadden (.49), 
Cox and Snell (.55), and Nagelkerke (.68) in the complete 
model with the logit link. 
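
The three pseudo R squares named above can be sketched in Python from the log-likelihoods of the intercept-only model and the fitted model. The formulas below are the commonly cited definitions of the McFadden, Cox and Snell, and Nagelkerke statistics; the log-likelihood values are hypothetical, because the paper does not report them, and the function name is an illustrative assumption.

```python
import math


def pseudo_r_squares(ll_null, ll_model, n):
    """Compute three pseudo R squares from log-likelihoods.

    ll_null: log-likelihood of the intercept-only model.
    ll_model: log-likelihood of the model with explanatory variables.
    n: number of cases analyzed.
    """
    mcfadden = 1.0 - ll_model / ll_null
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    # Nagelkerke rescales Cox and Snell by its maximum attainable value.
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    nagelkerke = cox_snell / max_cox_snell
    return {"mcfadden": mcfadden, "cox_snell": cox_snell, "nagelkerke": nagelkerke}


# Hypothetical log-likelihoods, for illustration only.
result = pseudo_r_squares(ll_null=-120.0, ll_model=-60.0, n=148)
for name, value in result.items():
    print(name, round(value, 3))
```

Because the Nagelkerke statistic divides the Cox and Snell statistic by a quantity smaller than one, it is always the larger of the two, which is consistent with the ordering of the values reported above.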

The additional model fitting statistic, the Pearson’s 
chi-square (χ² = 228.57 with d.f. of 242 and p = .723), for the 
complete model with the cloglog link indicated that the 
observed data were consistent with the estimated values 
in the fitted model. However, the Pearson’s chi-square 
test statistic (χ² = 282.46 with d.f. of 242 and p = .038) for 
the complete model with the logit link indicated that the 
observed data were not consistent with the estimated 
values in the fitted model. Hence, the complete model with 
the cloglog link was a better model as compared to the 
complete model with the logit link based upon the 
chi-square test results. 
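
For readers who wish to check a chi-square p value by hand, the following Python sketch computes the upper-tail probability of a chi-square statistic. It relies on a closed form that holds only when the degrees of freedom are even, as is the case for the 242 degrees of freedom reported above. The function name is an illustrative assumption, and this is a sketch rather than the procedure used by SPSS.

```python
import math


def chi_square_p_value_even_df(statistic, df):
    """Upper-tail probability P(X > statistic) for a chi-square variable.

    Valid only for positive even df, where the tail probability equals
    exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    if statistic < 0:
        raise ValueError("statistic must be non-negative")
    half = statistic / 2.0
    term = math.exp(-half)  # k = 0 term of the sum
    total = term
    for k in range(1, df // 2):
        term *= half / k  # build (x/2)^k / k! incrementally to avoid overflow
        total += term
    return total


# Pearson chi-square statistics reported for the two complete models.
print(chi_square_p_value_even_df(228.57, 242))  # cloglog link
print(chi_square_p_value_even_df(282.46, 242))  # logit link
```

The printed values can be compared with the p values reported in the text for the two complete models.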

The test of parallel lines was designed to make a judgment 
concerning the model adequacy. The null hypothesis 
stated that the corresponding regression coefficients were 
equal across all levels of the outcome variable. The 
alternative hypothesis stated that the corresponding 
regression coefficients differed across the levels of 
the outcome variable. The chi-square test result (χ² = 
60.75 with d.f. of 44 and p = .08) indicated that there was 
no significant difference for the corresponding regression 
coefficients across the response categories, suggesting 
that the model assumption of parallel lines was not violated 
in the complete model with the cloglog link. However, as 
previously mentioned, the complete model with the logit 
link failed to provide evidence of satisfying the 
assumption of parallel lines. 
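
The structure of the test of parallel lines can be sketched in Python as a likelihood ratio comparison. The sketch assumes that the test statistic is the difference between the -2 log-likelihood of the parallel (null) model and that of a general model with separate coefficients for each threshold, and that the degrees of freedom equal the number of explanatory variables multiplied by the number of outcome categories minus two. The function name and the -2 log-likelihood values are hypothetical.

```python
def parallel_lines_test(neg2ll_parallel, neg2ll_general, n_variables, n_categories):
    """Return the chi-square statistic and degrees of freedom for the test.

    neg2ll_parallel: -2 log-likelihood with one set of coefficients (null hypothesis).
    neg2ll_general: -2 log-likelihood with separate coefficients per threshold.
    n_variables: number of explanatory variables in the model.
    n_categories: number of observed outcome categories.
    """
    if n_categories < 3:
        raise ValueError("the test needs at least three outcome categories")
    statistic = neg2ll_parallel - neg2ll_general
    # The general model has (n_categories - 1) coefficient sets instead of one.
    df = n_variables * (n_categories - 2)
    return statistic, df


# Hypothetical -2 log-likelihoods; 44 variables and 3 categories as in the complete model.
statistic, df = parallel_lines_test(200.0, 150.0, n_variables=44, n_categories=3)
print(statistic, df)
```

With 44 explanatory variables and three observed response categories, this assumed formula gives 44 degrees of freedom, which agrees with the degrees of freedom reported for the complete model.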

The cross-tabulating method was used to categorize 
the classified and the actual responses into a 3 by 3 
classification table. Table 3 displays the accuracy of the 
classification results for the satisfaction response 
categories. The complete model with the cloglog link 
correctly classified 86% of the “very satisfied”, 82% of the 
“satisfied”, and 40% of the “dissatisfied” responses. The model 
demonstrated high prediction accuracy (82%) for all three categories 
combined. The classification results of the complete model 
with the logit link did not need to be presented in this 
paper because the model was unable to provide evidence of 
satisfying the test of parallel lines. Also, the result of 
the chi-square test for the model fitting of the complete 
model with the logit link failed to indicate that the observed 
data were consistent with the estimated values in the 
fitted model. 

Table 3 

Accuracy of the Classification for Response 
Categories Based on the Complete Model with 
the Complementary Log-log Link 

                            Classified Response Category (a)
Actual Response Category    Dissatisfied   Satisfied      Very Satisfied   Total
Dissatisfied                2 (40%)        3 (60%)                         5 (100%)
Satisfied                   1 (1%)         58 (82%)       12 (17%)         71 (100%)
Very Satisfied                             10 (14%)       62 (86%)         72 (100%)
Total                       3              71             74               148

(a) Cross Tab in SPSS commands was used to tabulate cell statistics. 
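
The accuracy figures quoted for Table 3 can be recomputed from the cell counts with a short Python sketch. The counts below are taken from Table 3; blank cells are entered as zero, which is consistent with the row and column totals. The function and variable names are illustrative assumptions.

```python
# Rows are actual categories; each list holds the counts classified as
# dissatisfied, satisfied, and very satisfied (in that order), from Table 3.
TABLE_3 = {
    "dissatisfied": [2, 3, 0],
    "satisfied": [1, 58, 12],
    "very satisfied": [0, 10, 62],
}


def classification_accuracy(table):
    """Return per-category accuracy and overall accuracy from a square table."""
    per_category = {}
    correct = 0
    total = 0
    for index, (label, row) in enumerate(table.items()):
        row_total = sum(row)
        per_category[label] = row[index] / row_total  # diagonal cell / row total
        correct += row[index]
        total += row_total
    return per_category, correct / total


per_category, overall = classification_accuracy(TABLE_3)
for label, value in per_category.items():
    print(label, f"{value:.0%}")
print("overall", f"{overall:.0%}")
```

The overall accuracy is the sum of the diagonal cells divided by the number of cases analyzed, so it weights each category by its frequency rather than averaging the three percentages.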

Similar to linear and logistic regression modeling 
techniques, the principle of parsimony was applicable to 
the construction of the ordinal regression model. The 
argument is that if the complete models containing all 
explanatory variables were too complex, the result could be 
inaccurate estimation of the parameters and instability of 
the model structure. Based on the above modeling strategy, 
the reduced models with the logit and cloglog links were 
constructed to include only 20 explanatory variables: 
the top and the bottom 10 item ratings in terms of the total percent 
of student respondents who reported ‘satisfied’ or ‘very 
satisfied’. The reduced model analyzed 155 of the 179 
questionnaires and excluded 24 questionnaires from the 
study because they had at least one item with missing 
data or a ‘not applicable’ rating. 
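
The selection of the top and bottom item ratings described above can be sketched in Python as a simple ranking. The function name and the percentages below are hypothetical and serve only as an illustration; the study's item-level percentages are not reproduced here.

```python
def select_extreme_items(percent_satisfied, k=10):
    """Return the k highest-rated and the k lowest-rated items.

    percent_satisfied: mapping of item name to the total percent of respondents
    who reported 'satisfied' or 'very satisfied'.
    """
    if len(percent_satisfied) < 2 * k:
        raise ValueError("need at least 2 * k items so the two groups do not overlap")
    ranked = sorted(percent_satisfied, key=percent_satisfied.get, reverse=True)
    return ranked[:k], ranked[-k:]


# Hypothetical percentages, for illustration only.
example = {
    "faculty competence": 95.0,
    "quality of instruction": 93.0,
    "library services": 80.0,
    "tutorial services": 78.0,
    "housing facilities": 55.0,
    "parking": 40.0,
}
top, bottom = select_extreme_items(example, k=2)
print("top:", top)
print("bottom:", bottom)
```

With k set to 10 and the 42 questionnaire items as input, the same function would return the 20 explanatory variables of a reduced model of this kind.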

Table 4 shows the results of the reduced model with 
the logit link, indicating that the satisfaction of the overall college 
experience was significantly associated with the satisfaction 
ratings of three explanatory variables: health 
promotion/disease prevention, faculty competence, and 
student-faculty relations. 

Table 4 

Selected 20 Explanatory Variables (a) 
Associated with the Overall College 
Satisfaction Based on the Reduced Model with 
the Logit Link 

                                      Regression                                            Regression
Item Name                             Coefficient P Value   Item Name                       Coefficient P Value
Threshold (Category=2)                5.782       .001*     HIV/AIDS                        .680        .065
Threshold (Category=3)                10.020      .000*     Housing Facilities              -.164       .153
Bookstore                             .100        .733      Research Methodology            -.048       .708
Health Promotion/Disease Prevention   .674        .033*     Career Counseling               .159        .294
Clinical Skills                       .549        .230      Personal Counseling             .042        .772
Accessibility to Faculty              .482        .096      Computer Skills                 -.193       .259
Faculty Competence                    .700        .026*     Parking                         .017        .918
Quality of Instruction                .385        .317      Student Recreation              .207        .163
Medical Ethics                        -.061       .842      Video Services                  .069        .609
Communication Skills                  -.350       .265      Mail Room Services              -.096       .504
Student-Faculty Relations             1.291       .000*     Computer-Assisted Instruction   .089        .525

(a) Top and bottom 10 item ratings in terms of the total percent of student respondents who reported ‘satisfied’ and ‘very satisfied’. 
* Significant difference from zero (p<.05) 

The results of the reduced model (e.g., item name, 
regression coefficient, and p value) with the cloglog link did 
not need to be presented in the paper because the model 
assumption of parallel lines was violated. The model 
assumption of parallel lines in the reduced model with the 
logit link was not violated (χ² = 25.567 with d.f. of 20 
and p = .18). In addition, the result of the Pearson’s 
chi-square test (χ² = 208.25 with d.f. of 276 and p = .999) 
for the reduced model with the logit link indicated that the 
observed data were consistent with the estimated values 
in the fitted model. Hence, the reduced model with the logit link 
was a good model. The three pseudo R squares, 
McFadden (.37), Cox and Snell (.45), and Nagelkerke 
(.57), were high for the reduced model with the logit link. 
Furthermore, the accuracy of the classification results for 
the satisfaction response categories is shown in Table 5. 
The reduced model with the logit link correctly classified 
78% of the “very satisfied”, 74% of the “satisfied”, and 
40% of the “dissatisfied” responses. The model demonstrated fairly high 
prediction accuracy (75%) for all three categories combined. 
If the principle of parsimony was considered to be the 
most important modeling strategy, then the reduced model 
with the logit link might be a better model when compared 
to the complete model with the cloglog link. The reduced 
model with the logit link appeared to be the best model in 
this study based on the model fitting statistics, the accuracy 
of classification results, and the principle of parsimony. 

Table 5 

Accuracy of the Classification for Response 
Categories Based on the Reduced Model with 
the Logit Link 

                            Classified Response Category (a)
Actual Response Category    Dissatisfied   Satisfied      Very Satisfied   Total
Dissatisfied                2 (40%)        3 (60%)                         5 (100%)
Satisfied                   1 (2%)         55 (74%)       18 (24%)         74 (100%)
Very Satisfied                             17 (22%)       59 (78%)         76 (100%)
Total                       3              75             77               155

(a) Cross Tab in SPSS commands was used to tabulate cell statistics. 
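
The screening logic applied across the four candidate models can be summarized in a Python sketch. The p values and accuracies are those reported in the text; None marks a value that was not reported or an assumption test that the model failed. The dictionary layout, the function names, and the .05 cut-off applied to both chi-square tests are illustrative assumptions rather than a procedure stated by the authors.

```python
CANDIDATES = [
    {"name": "complete, cloglog", "n_variables": 44,
     "parallel_lines_p": 0.08, "pearson_p": 0.723, "accuracy": 0.82},
    {"name": "complete, logit", "n_variables": 44,
     "parallel_lines_p": None, "pearson_p": 0.038, "accuracy": None},
    {"name": "reduced, cloglog", "n_variables": 20,
     "parallel_lines_p": None, "pearson_p": None, "accuracy": None},
    {"name": "reduced, logit", "n_variables": 20,
     "parallel_lines_p": 0.18, "pearson_p": 0.999, "accuracy": 0.75},
]


def passes_screening(model, alpha=0.05):
    """Keep a model only if neither chi-square test rejects at level alpha."""
    p_values = (model["parallel_lines_p"], model["pearson_p"])
    return all(p is not None and p > alpha for p in p_values)


def most_parsimonious(models):
    """Among screened models, prefer the one with the fewest explanatory variables."""
    return min(models, key=lambda m: m["n_variables"])


screened = [m for m in CANDIDATES if passes_screening(m)]
best = most_parsimonious(screened)
print([m["name"] for m in screened], "->", best["name"])
```

The sketch treats parsimony as the deciding criterion among the models that survive screening, which mirrors the conditional statement in the paragraph above; a researcher who weighted classification accuracy more heavily could reach a different choice.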

Implications and Conclusion 

Several research findings are worth reiterating 
in this study. The reduced model with the logit link became 
the best model based on the screening criteria: the 
validity of the model assumption, the fitting statistics (e.g., 
Pearson’s chi-square and pseudo R squares), the accuracy 
of the classification results, the principle of parsimony, 
and the stability of parameter estimation. Therefore, 
the major research findings and implications 
should be drawn from the best model. 

The two explanatory variables related to the satisfaction 
of faculty involvement (i.e., faculty competence and 
student-faculty relations) were identified in the best model. 
Student satisfaction with faculty involvement significantly 
contributed to the probability of students expressing 
satisfaction with the overall college experience. It is 
expected that a small medical college with a low 
student-faculty ratio could lead to higher student satisfaction ratings 
regarding faculty involvement. Nevertheless, the finding provided 
compelling evidence that faculty members have played a 
significant role in creating a pleasant environment that influenced 
student satisfaction with the overall college experience. 

In addition, the curriculum content regarding health 
promotion and disease prevention was significantly 
associated with the satisfaction of the overall college 
experience. It may provide evidence that one component 
of the medical curriculum has addressed the needs of 
medical students and contributed to the fulfillment of 
the medical college’s goal, e.g., delivery of primary care through 
health promotion and disease prevention. The study 
suggested that the vast majority of student respondents 
expressed their satisfaction with faculty (e.g., faculty 
competence, student-faculty relations) and curriculum content 
(e.g., health promotion/disease prevention). The research 
findings in this study seemed to be consistent with the previous 
study reported by the University of Michigan (Robins et al., 
1997), where students strongly valued their learning 
environment, especially with faculty. 

Overall, this study should be viewed as an important first 
step for the medical college to explore the relationship 
between the overall college satisfaction and multiple 
explanatory variables concerning faculty involvement, 
curriculum contents, support services, facilities, and leisure 
activities in college. The knowledge gained from this study 
would be beneficial to the medical college and its students. 
The goal was to obtain information from students to establish 
benchmarks that could be helpful to decision makers in 
the medical college for improving medical education. For 
example, the medical college could pursue its ultimate goal 
of ensuring student satisfaction with the overall college 
experience by enhancing faculty involvement and curriculum 
contents. Medical students, in turn, could be assured of 
participating in quality programs supported by 
capable faculty and adequate curriculum contents. 

In this study, the principle of parsimony along with 
various link functions was adopted to build the candidate 
models and to search for the best model. Much of the time 
and energy was devoted to developing candidate models, 
checking the model assumptions, assuring the model 
goodness of fit, and consequently selecting the best model 
for the medical college. The model building itself might be 
partly statistical methodology and partly the experience and 
common sense of the researchers. The ordinal regression 
method provides a viable alternative for analyzing student 
satisfaction data with an ordered categorical outcome. It 
does not treat an ordinal outcome as a binary or dichotomous 
measure, as logistic regression analysis does, which may lead 
to a loss of inherent information. Nor does it falsely 
assume a continuous measure with the properties of 
normality and constant variance, as linear regression does when 
analyzing an ordinal outcome with few categories, which may 
lead to incorrect analysis. Clearly, ordinal regression 
modeling is a unique statistical technique in that the 
ordinal outcome variable is frequently encountered in the 
field of educational research and the model assumption of 
parallel lines is easily assumed and verified. This modeling 
technique is a practical tool that should be added 
to a practicing researcher’s toolkit. 

Summary 

It is convenient for some researchers to analyze an ordinal 
outcome by means of logistic and linear regression 
analyses. By altering the measurement scale of the ordinal 
outcome, researchers are able to analyze data and produce 
research findings. However, a loss of information or 
incorrect analysis may occur in some cases. For 
instance, when the scale of outcome categories (e.g., 
very satisfied, satisfied, dissatisfied, and very dissatisfied) 
is arbitrarily collapsed into a binary measure (e.g., satisfied 
and dissatisfied), researchers are forced to use logistic 
regression analysis to analyze the two levels of the ordinal 
outcome. By doing so, important information may be lost 
in the resulting model. Also, when an ordinal outcome with 
few categories is treated as a continuous measure, the linear 
regression method is applied to an ordinal outcome 
for which normality and constant 
variance cannot plausibly be assumed. Using the linear regression 
method to analyze the 
ordinal outcome, researchers may produce incorrect 
estimation and interpretation because of the violation of the 
model assumptions. Therefore, if researchers wish to study 
the effects of explanatory variables on all levels of the 
ordered categorical outcome, an ordinal regression method 
must be chosen in order to obtain valid 
research results. 

In this study, the ordinal regression method was used 
to model the relationship between the ordinal outcome 
variable, e.g., different levels of student satisfaction 
regarding the overall college experience, and the 
explanatory variables concerning demographics and 
student learning environment. The outcome variable for 
student satisfaction was measured on an ordered, 
categorical, and four-point Likert scale — ‘very dissatisfied’, 
‘dissatisfied’, ‘satisfied’, and ‘very satisfied’. Explanatory 
variables included two demographics, e.g., gender and 
ethnic groups, and 42 questionnaire items related to the 
satisfaction of faculty involvement, curriculum contents, 
support services, facilities, and leisure activities at the 
college. The research findings indicated faculty 
competence, student-faculty relations, and curriculum 
content regarding health promotion and disease prevention 
were significantly associated with the satisfaction of the 
overall college experience. Using the ordinal regression 
method, researchers could identify the significant 
explanatory variables within their control to enhance student 
satisfaction with the college learning environment. 

Essentially, four sequential protocols are performed 
to create an ordinal regression model. First, the explanatory 
variables are examined to determine if they should be 
included in the model. Second, the outcome variable is 
coded in an ordered, ranked, and categorical fashion. The 
explanatory variables are quantified by continuous and 
discrete measures. Third, the complete and the reduced 
models as well as the logit link and the complementary 
log-log (cloglog) link are used to produce the candidate 
models. The complete model contains all the explanatory 
variables, while the reduced model includes 
only a subset of the predetermined explanatory variables. 
Finally, the best model is chosen among all candidate 
models depending upon the model fitting statistics, the 
accuracy of the classification results, and the validity of 
the model assumption. 

Strengths 

The strengths of the ordinal regression model in this 
study are briefly described. First, many indicators 
of student learning outcomes are frequently 
measured on an ordinal scale. For instance, course 
performance on a letter grade scale (e.g., A, B, C, and 
D) and satisfaction levels perceived by students on a 
Likert scale (e.g., very satisfied, satisfied, dissatisfied, 
and very dissatisfied) are most appropriately measured on 
an ordinal scale. Thus, the ordinal regression model seems 
to have broad applicability for analyzing diverse student 
learning outcomes. Second, comparable to linear and 
logistic regression models, the ordinal regression model can 
be used to perform the following tasks: (1) to identify 
significant explanatory variables that influence the 
ordinal outcome; (2) to describe the direction of the 
relationship between the ordinal outcome and the 
explanatory variables; and (3) to perform classifications for 
all levels of the ordinal outcome, and subsequently evaluate 
the predictive validity of the regression model. Third, various 
link functions such as the logit and cloglog links are readily 
available to model the effect of the explanatory variables 
on the ordinal outcome. Fourth, the test of parallel lines 
can be easily used to assess the validity of the model 
assumption, and the model fitting statistics (e.g., -2 log 
likelihood ratio and pseudo R squares) can be used as 
criteria to screen the candidate models and choose the 
most appropriate one. Finally, the model assumes that 
the relationship between the ordinal outcome and the 
explanatory variables is independent of the category. This 
assumption implies that the corresponding regression 
coefficients in the link function are equal for each cut-off 
point (Bender and Benner, 2000). Therefore, it is easy to 
construct and interpret the ordinal regression model, which 
requires only one model assumption and produces only 
one set of regression coefficients. 

Indeed, the ordinal regression technique provides a 
viable alternative for analyzing the ordinal outcome. It does 
not convert an ordinal outcome into a binary or dichotomous 
measure for logistic regression analysis, which may lead to 
a loss of inherent information. Also, it does not falsely 
assume a continuous measure with the properties of normality 
and constant variance, as linear regression does when analyzing an 
ordinal outcome with few categories, which may lead to incorrect 
analysis. Ordinal regression modeling is a 
unique statistical technique in that the ordinal outcome is 
frequently encountered in the field of education and the 
model assumption of parallel lines is easily verified. 

Weaknesses 

Researchers need to be aware of the limitations in 
using the ordinal regression model. For instance, the “not 
applicable” responses to the satisfaction items (i.e., 
explanatory variables) are treated as missing values and 
excluded from the study. A large percentage of cells with 
missing data could lead to a decrease in the actual sample 
size for the model construction or an inaccurate 
chi-square test for the model fitting. Note that the assessment of model 
goodness of fit usually depends on the chi-square test 
result, which in turn normally depends on the 
sample size. Hence, if the number of cells with a zero value 
is large, the chi-square goodness-of-fit statistics may not 
be appropriate (Agresti, 1990). Thus, researchers are 
limited in how well they can assess the model goodness 
of fit. In addition, the logit link and cloglog link in the 
ordinal regression analysis are not capable of selecting a 
subset of significant explanatory variables by means of 
automatic model building methods such as stepwise and 
backward elimination procedures in the SPSS command language. 
Therefore, researchers are obliged to rely on their own 
intuition and experience to select a subset of the important 
or significant explanatory variables in the model. As a 
result, much of the time and energy is devoted to developing 
candidate models, checking the model assumptions, and 
assuring the model goodness of fit. 

Major Alternative 

The ordinal regression model is strictly built on 
the model assumption of parallel lines (i.e., equal 
regression coefficients) for all corresponding outcome 
categories. If the verification of the model assumption fails, 
the multinomial logistic regression model, which does not 
require this assumption, should become an alternative 
tool. The multinomial logistic regression model is an 
extension of binary logistic regression, and automatic 
model building methods for it are built into the SPSS PC version 12.0 
commands. In multinomial logistic regression, the outcome 
variable is categorized into nominal groups: the target 
groups and the reference group. For example, the very 
dissatisfied rating is labeled as target group 1; the dissatisfied 
rating is coded as target group 2; the satisfied rating is 
considered target group 3; and the very satisfied rating is 
treated as the reference group. Three model equations are 
generated for the nominal outcome with four categories. 
Three sets of relative risks are calculated by comparing the 
probability of individual students falling into a specific target 
category (j) with the probability of falling into the 
reference category (k), e.g., P(Y = y_j) / P(Y = y_k) (Plank and 
Jordan, 1997). The magnitude of the effect of a specific 
explanatory variable can be expressed as the average effect of a 
one-unit change in that explanatory variable on the 
relative risk of individual students falling 
into the target category rather than the 
reference category. 
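
The relative risk P(Y = y_j) / P(Y = y_k) described above can be illustrated with a Python sketch of a baseline-category (multinomial) logit model. The linear predictor values and the function names are hypothetical; the reference category is assigned a linear predictor of zero, which is the usual convention for this model and is stated here as an assumption.

```python
import math


def multinomial_probabilities(linear_predictors):
    """Category probabilities for a baseline-category logit model.

    linear_predictors: mapping of each target category to its linear predictor.
    The reference category has a linear predictor of zero by construction.
    """
    exponentials = {cat: math.exp(eta) for cat, eta in linear_predictors.items()}
    denominator = 1.0 + sum(exponentials.values())
    probabilities = {cat: value / denominator for cat, value in exponentials.items()}
    probabilities["reference"] = 1.0 / denominator
    return probabilities


def relative_risks(probabilities, reference="reference"):
    """Ratio of each target category's probability to the reference probability."""
    base = probabilities[reference]
    return {cat: p / base for cat, p in probabilities.items() if cat != reference}


# Hypothetical linear predictors for the three target groups; 'very satisfied'
# plays the role of the reference group.
etas = {"very dissatisfied": -3.0, "dissatisfied": -1.5, "satisfied": 0.4}
probs = multinomial_probabilities(etas)
risks = relative_risks(probs)
for category, risk in risks.items():
    print(category, round(risk, 4))
```

In this formulation each relative risk equals the exponential of the corresponding linear predictor, so a one-unit change in an explanatory variable multiplies the relative risk by the exponential of its coefficient.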

Editor’s Notes 

This article is an excellent reminder that there is life 
beyond linear and even logistic regression. It walks the 
researcher through some of the key decision points that 
are faced, but because of the complexity of the topic, it 
should be seen as an introduction, with additional 
work needed to use ordinal regression with comfort. 

The following are some notes. 

1. As noted in the summary, the multinomial logit has 
less binding assumptions than the ordinal regression. 
In addition to being an alternative if the parallel 
slopes assumption is not met, it also is an 
appropriate choice if one does not feel comfortable 
with the ordinal assumption about the dependent 
variable. 



2. As the authors indicate, after the decision of the 
model, there is the decision of the linking function. 
They presented two of several alternatives but there 
are others such as the probit. The discussion of 
criteria both from concepts in the literature and from 
the empirical results is very helpful. At the same 
time, the presence of multiple linking functions 
indicates that users of ordinal regression will need 
to go beyond this introduction. 

3. How to handle the issue of missing data is 
always a problem. The authors handle this issue by 
dropping the cases. One alternative is related to the 
possibility of combining the individual items into the 
“factors” the authors provided and using an item 
average score for the items for which there were 
valid responses. This loses some of the detail but 
it also deals with the fact that for 45 items, about 2 
would be expected to be significant at the .05 level 
by chance. 

4. The results are presented well. The three general 
tests include the goodness of fit of the process to 
its assumptions - as in the test for parallel slopes, 
the overall test of the fit of the model - as in the 
proportion of the variation explained, and the ability 
to anticipate outcomes - as in the classification 
matrix. As is often the case, some of these 
methodologies are approximations, as indicated by 
the need to use approximate coefficients of 
determination, since the variance of an ordinal variable 
cannot be defined as it is for an interval variable. 

5. There are also some additional advanced topics and 
aspects of the SPSS software such as the use of 
“factors” and “interactions” as well as the independent 
measures which are considered “covariates.” There 
are options on the iterative procedure used to 
calculate the equation. As noted above, there are 
multiple linking functions. Seeing these and other 
complexities as challenges, the greatest challenge 
will be when a colleague asks, “How did you 
calculate the equation?” and a customer asks, “Now 
how do you interpret the results?” This article is an 
excellent introduction to start answering both 
questions. In cases where an interval dependent 
variable is not available, it gives us an example of 
what might be a much needed alternative to our 
current procedures. 
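
The alternative raised in note 3 can be sketched in Python: an item average computed over the valid responses only, together with the number of significant results expected by chance. The function names and the example ratings are hypothetical, and None is used here to stand for a missing or ‘not applicable’ response.

```python
def item_average(ratings):
    """Average the valid ratings within a factor, ignoring missing responses.

    ratings: list of numeric ratings, with None for missing or 'not applicable'.
    Returns None when no valid rating is available.
    """
    valid = [r for r in ratings if r is not None]
    if not valid:
        return None
    return sum(valid) / len(valid)


def expected_chance_findings(n_tests, alpha=0.05):
    """Expected number of tests significant by chance when no effect exists."""
    return n_tests * alpha


# Hypothetical ratings on a four-point scale for one respondent and one factor.
print(item_average([4, None, 3, 3]))
print(expected_chance_findings(45))
```

Averaging over valid responses keeps a respondent in the analysis when only some items within a factor are missing, at the cost of the item-level detail mentioned in the note.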